Automatic Categorization Tool for Open Software Repositories
Abstract
The world of Open Source software has demonstrated the remarkable appeal of communal software development. Large numbers of software projects can leverage, reuse, and coordinate their work through the Internet and web-based technology. For example, SourceForge currently hosts about sixty thousand software systems. Similar strategies have been suggested for corporate software development, through notions like Corporate Source and Progressive Open Source [6, 7]. When used in a corporate setting, infrastructures for sharing project information present new opportunities. For example, one would like to identify all projects that have something in common, so that the project groups can collaborate and share their work. With thousands of projects, manually locating related projects can be difficult. Hence, we propose to use automatic software categorization to find clusters of related software projects, using only the source code of the projects. Our experiments with a small set of C programs demonstrate the potential for automatic categorization of software systems without human aid.
Similar resources
Automatic Categorization of Software Modules
The world of software has demonstrated the remarkable appeal of communal software development. Large numbers of software projects can leverage, reuse, and coordinate their work through the Internet and web-based technology. For example, SourceForge currently hosts about sixty thousand software systems, and similar strategies have been suggested for corporate software development. With thousands of projects,...
A text categorisation tool for open source communities based on semantic analysis
Open source software (OSS) projects are supported by communities interacting through software repositories and mailing lists. Thousands of contributors participate in the development of the projects although they rarely meet each other. The result is a huge archived repository with thousands of questions, answers and contributions usually difficult to explore. We propose a tool based on semanti...
Mining Software Repositories for Defect Categorization
Early detection of software defects is very important for decreasing software cost and, in turn, increasing software quality. The success of software industries depends not only on gaining knowledge about software defects, but also largely on the manner in which information about defects is collected and used. In software industries, individuals at different levels from customers to engi...
Approaches for Categorization of Reusable Software Components
A reuse repository manager organizes reusable software components into different categories and needs to find the category of each component. In this paper, we have used different pure and hybrid approaches to find the domain relevancy of a component to a particular domain. Probabilistic Latent Semantic Analysis (PLSA) approach, LSA, Singular Value Decomposition (SVD) technique,...
Hierarchical Categorization of Open Source Software by Online Profiles
The large amounts of freely available open source software over the Internet are fundamentally changing the traditional paradigms of software development. Efficient categorization of the massive projects for retrieving relevant software is of vital importance for Internet-based software development such as solution searching, best practices learning and so on. Many previous works have been cond...
Publication date: 2003